Accurate detection of de novo and transmitted indels within exome-capture data using micro-assembly

Authors

  • Giuseppe Narzisi
  • Jason A. O'Rawe
  • Ivan Iossifov
  • Han Fang
  • Yoon-ha Lee
  • Zihua Wang
  • Yiyang Wu
  • Gholson J. Lyon
  • Michael Wigler
  • Michael C. Schatz
Abstract

We present a new open-source algorithm, Scalpel, for sensitive and specific discovery of INDELs in exome-capture data. By combining the power of mapping and assembly, Scalpel carefully searches the de Bruijn graph for sequence paths that span each exon. A detailed repeat analysis coupled with a self-tuning k-mer strategy allows Scalpel to outperform other state-of-the-art approaches for INDEL discovery. We extensively compared Scalpel against two recent algorithms, GATK HaplotypeCaller and SOAPindel, on a battery of >10,000 simulated and >1,000 experimentally validated INDELs. We report anomalies in how these tools detect INDELs in regions containing near-perfect repeats. We also present a large-scale application of Scalpel for detecting de novo and transmitted INDELs in 593 families from the Simons Simplex Collection. Scalpel demonstrates enhanced power to detect long (≥20 bp) transmitted events, and strengthens previous reports of enrichment for de novo likely gene-disrupting INDELs in autistic children, with many new candidate genes.
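The idea of searching a de Bruijn graph for paths that span a region, with k chosen by a repeat check, can be illustrated with a toy sketch. This is not Scalpel's implementation: the function names (`has_repeat`, `build_graph`, `spanning_paths`, `assemble_window`), the `k_min`, `k_max`, and `slack` parameters, and the example sequences are all hypothetical, and the repeat test used here (raise k until no k-mer occurs twice in the reference window) is only one simple way such a self-tuning strategy might look.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return every k-mer of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def has_repeat(seq, k):
    """True if some k-mer occurs more than once in seq (an exact repeat at this k)."""
    found = kmers(seq, k)
    return len(found) != len(set(found))


def build_graph(reads, k):
    """Toy de Bruijn graph: nodes are k-mers, edges join consecutive k-mers in a read."""
    graph = defaultdict(set)
    for read in reads:
        read_kmers = kmers(read, k)
        for a, b in zip(read_kmers, read_kmers[1:]):
            graph[a].add(b)
    return graph


def spanning_paths(graph, source, sink, max_len):
    """Spell out every path from source to sink, at most max_len bases long."""
    paths = []
    stack = [(source, source)]
    while stack:
        node, seq = stack.pop()
        if node == sink:
            paths.append(seq)
            continue
        if len(seq) >= max_len:
            continue  # length bound guarantees termination even if the graph has cycles
        for nxt in graph.get(node, ()):
            stack.append((nxt, seq + nxt[-1]))
    return paths


def assemble_window(reference, reads, k_min=3, k_max=15, slack=10):
    """Assumed self-tuning loop: use the smallest k with no exact repeat in the window."""
    for k in range(k_min, k_max + 1):
        if has_repeat(reference, k):
            continue  # repeat at this k would make paths ambiguous; try a larger k
        graph = build_graph(reads, k)
        paths = spanning_paths(graph, reference[:k], reference[-k:],
                               len(reference) + slack)
        if paths:
            return k, sorted(paths)
    return None, []


# Hypothetical window: "ACG" occurs twice, so k=3 is rejected and k=4 is used.
reference = "ACGTTGCACGACCTAG"
# Hypothetical reads from a sample carrying a 2 bp deletion inside the window.
reads = ["ACGTTGCG", "TTGCGACC", "CGACCTAG"]

k, paths = assemble_window(reference, reads)
print(k, paths)
```

In this toy setting, a spanning path shorter or longer than the reference window signals a candidate deletion or insertion; a real caller would additionally align each path back to the reference and weigh read support.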

Similar resources

De novo diagnostics of patients with intellectual disability

Germline coding de novo mutations (SNVs, indels as well as CNVs) are an important cause of moderate to severe forms of intellectual disability (ID) and associated syndromes. Exome sequencing now allows us to reliably identify these mutations using a single genomic test, and we have recently implemented exome sequencing in the diagnostic follow-up of these patients. In this presentation, I will ...

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

De novo insertions and deletions of predominantly paternal origin are associated with autism spectrum disorder.

Whole-exome sequencing (WES) studies have demonstrated the contribution of de novo loss-of-function single-nucleotide variants (SNVs) to autism spectrum disorder (ASD). However, challenges in the reliable detection of de novo insertions and deletions (indels) have limited inclusion of these variants in prior analyses. By applying a robust indel detection method to WES data from 787 ASD families...

Accurate HLA Typing at High-Digit Resolution from NGS Data

Human leukocyte antigen (HLA) typing from next generation sequencing (NGS) data has the potential for applications in clinical laboratories and population genetic studies. Here we introduce a novel technique for HLA typing from NGS data based on read-mapping using a comprehensive reference panel containing all known HLA alleles and de novo assembly of the gene-specific short reads. An accurate ...

Exome sequencing in the knockin mice generated using the CRISPR/Cas system

A knockin (KI) mouse carrying a point mutation has been an invaluable tool for disease modeling and analysis. Genome editing technologies using the CRISPR/Cas system have emerged as an alternative way to create KI mice. However, if the mice carry nucleotide insertions and/or deletions (InDels) in other genes, which could have unintentionally occurred during the establishment of the KI mouse line a...


Volume 11

Publication date: 2014